Research conducted by Hui-Yi Lin, PhD, Associate Professor of Biostatistics at LSU Health New Orleans School of Public Health, has produced two novel statistical methods for detecting interactions among genetic variants associated with cancers or other complex diseases. Both methods are published in Bioinformatics, one of the leading journals in the field.
“Selecting a good statistical method is essential to conducting a solid study,” notes Dr. Lin. “It is like choosing a good recipe for preparing a meal. If the recipe is bad, even good ingredients won’t result in a good meal. The same concept can be applied to genetic association studies.”
During the past decade, genome-wide association studies (GWAS) have successfully identified many inherited genetic variants, or single nucleotide polymorphisms (SNPs), associated with cancers or other complex diseases. SNPs, pronounced “snips,” are genetic variants in single DNA building blocks called nucleotides. For example, when one nucleotide is replaced by another inside a gene or in a region that determines gene function, disease can develop or worsen. According to the US National Library of Medicine, “SNPs occur normally throughout a person’s DNA. They occur once in every 300 nucleotides on average, which means there are roughly 10 million SNPs in the human genome.”
Leslie Capo
Office: 504-568-4806
Cell: 504-452-9166
lcapo@lsuhsc.edu
SNP interaction pattern identifier (SIPI): an intensive search for SNP–SNP interaction patterns
AA9int: SNP interaction pattern search using non-hierarchical additive model set
The conventional approach to testing SNP interactions uses a hierarchical interaction model: two main effects plus their interaction, with both SNPs coded under an additive inheritance mode.
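In symbols, the conventional model can be sketched as follows. This assumes a binary outcome modeled with logistic regression, which the text above does not specify; G_1 and G_2 are illustrative symbols for the number of copies of the minor allele carried at each SNP.

```latex
\operatorname{logit} P(Y = 1) = \beta_0 + \beta_1 G_1 + \beta_2 G_2 + \beta_3 G_1 G_2,
\qquad G_1, G_2 \in \{0, 1, 2\}
```

The model is called hierarchical because the interaction term \beta_3 G_1 G_2 appears only alongside both main effects.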
To identify significant SNP-SNP interactions, Lin developed two statistical methods: the SNP Interaction Pattern Identifier (SIPI) and the Additive-Additive 9 Interaction-model approach (AA9int). SIPI evaluates 45 SNP interaction patterns by considering three major factors: model structure (hierarchical and non-hierarchical), genetic inheritance mode (dominant, recessive and additive), and mode coding direction. Her study demonstrated that SIPI can detect novel SNP interactions that cannot be detected using the conventional statistical approach. These interactions can predict disease outcome better than individual SNPs.
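The three factors named above can be illustrated with a short Python sketch. It is not Lin's implementation and does not enumerate the 45 SIPI patterns. The function names, the structure labels, and the treatment of "coding direction" as flipping the coded values are assumptions made for illustration.

```python
import numpy as np

def code_genotype(minor_allele_count, mode, reverse=False):
    """Recode 0/1/2 minor-allele counts under one inheritance mode.

    `reverse` flips the coding direction (an illustrative assumption
    about what "mode coding direction" means).
    """
    g = np.asarray(minor_allele_count)
    if mode == "additive":
        coded, top = g.copy(), 2                 # 0, 1, 2 copies
    elif mode == "dominant":
        coded, top = (g >= 1).astype(int), 1     # at least one copy
    elif mode == "recessive":
        coded, top = (g == 2).astype(int), 1     # two copies only
    else:
        raise ValueError(f"unknown inheritance mode: {mode}")
    return top - coded if reverse else coded

def design_matrix(x1, x2, structure):
    """Build regression columns for one model structure.

    'full' is the hierarchical model (both main effects plus the
    interaction); the others are non-hierarchical because they drop
    one or both main effects. The labels are hypothetical.
    """
    interaction = x1 * x2
    columns = {
        "full": [x1, x2, interaction],
        "main1_int": [x1, interaction],
        "main2_int": [x2, interaction],
        "int_only": [interaction],
    }[structure]
    intercept = np.ones_like(interaction)
    return np.column_stack([intercept] + columns)

# Toy genotypes for two SNPs across five individuals (made-up values).
snp1 = np.array([0, 1, 2, 1, 0])
snp2 = np.array([2, 1, 0, 2, 1])

x1 = code_genotype(snp1, "dominant")
x2 = code_genotype(snp2, "recessive", reverse=True)
X = design_matrix(x1, x2, "int_only")
```

Each combination of structure, inheritance mode, and coding direction gives a different design matrix. A search of this kind fits a model for each candidate and keeps the best-fitting pattern for every SNP pair, which explains why the computational cost grows quickly.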
“SIPI is statistically powerful but suffers from a large computation burden,” says Lin. “It requires tremendous computational resources.”
Using a supercomputer from the Louisiana Optical Network Infrastructure (LONI), Lin applied these two methods to identify SNP-SNP interactions in the angiogenesis pathway associated with prostate cancer aggressiveness. The data came from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium cohort, the largest prostate cancer consortium in the world.
“Applying SIPI to the prostate cancer PRACTICAL consortium data with approximately 21,000 patients, we found that the four SNP pairs in EGFR-EGFR, EGFR-MMP16, and EGFR-CSF1 were associated with prostate cancer aggressiveness with the exact or similar pattern in the discovery and validation sets,” Lin says.
Lin's study findings demonstrated that SIPI can detect more meaningful interaction patterns than the conventional approach or other existing methods.
“The SNP interaction pairs, identified using these two novel approaches (AA9int or SIPI), can be applied to build risk prediction models or genetic risk scores for cancers or other complex diseases,” she adds. “These identified gene-gene or SNP-SNP interactions can provide insight to understand the biological mechanism of cancer development and may improve cancer diagnosis accuracy and reduce cancer-related deaths in the future.”
Lin is the principal investigator of an NIH/NCI-funded R21 grant to study gene-gene interactions associated with prostate cancer aggressiveness. These two new statistical methods are being applied to analyze genetic data for this project. The preliminary results are promising.